Efficient Matrix-Encoded Grammars and Low Latency Parallelization Strategies for CYK

Authors

  • Aaron Dunlop
  • Nathan Bodenstab
  • Brian Roark

Abstract

We present a matrix encoding of context-free grammars, motivated by hardware-level efficiency considerations. We find efficiency gains of 2.5–9× for exhaustive inference and approximately 2× for pruned inference, resulting in high-accuracy parsing at over 20 sentences per second. Our grammar encoding allows fine-grained parallelism during chart cell population; we present a controlled study of several methods of parallel parsing, and find near-optimal latency reductions as core-count increases.
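
For readers unfamiliar with the chart that the abstract refers to, the following is a minimal sketch of a textbook CYK recognizer over a grammar in Chomsky normal form. It is not the authors' matrix encoding or their parallelization strategy; the function name `cyk_recognize`, the dictionary-based rule tables, and the toy grammar are assumptions made for illustration only.

```python
def cyk_recognize(words, lexical_rules, binary_rules, start="S"):
    """Return True if `words` is derivable from `start` under a CNF grammar.

    lexical_rules: dict mapping word -> set of nonterminals A with A -> word
    binary_rules:  dict mapping (B, C) -> set of nonterminals A with A -> B C
    (Both table layouts are illustrative assumptions, not the paper's encoding.)
    """
    n = len(words)
    if n == 0:
        return False

    # chart[(i, j)] holds the nonterminals that can span words[i:j].
    chart = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(lexical_rules.get(word, ()))

    # Fill cells in order of increasing span length.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            cell = set()
            # Try every split point k and every pair of child nonterminals.
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        cell |= binary_rules.get((left, right), set())
            chart[(i, j)] = cell

    return start in chart[(0, n)]


# Hypothetical toy grammar: S -> NP VP, VP -> V NP, NP -> she | fish, V -> eats
LEXICAL = {"she": {"NP"}, "fish": {"NP"}, "eats": {"V"}}
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}

accepted = cyk_recognize("she eats fish".split(), LEXICAL, BINARY)
rejected = cyk_recognize("eats she fish".split(), LEXICAL, BINARY)
```

Cells of the same span length depend only on cells of shorter spans, so they can be filled independently of one another, which is one reason the CYK chart is a natural target for parallel population.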

Similar resources

Iterative CKY Parsing for Probabilistic Context-Free Grammars

This paper presents an iterative CKY parsing algorithm for probabilistic context-free grammars (PCFG). This algorithm enables us to prune unnecessary edges produced during parsing, which results in more efficient parsing. Since pruning is done by using the edge’s inside Viterbi probability and the upper-bound of the outside Viterbi probability, this algorithm guarantees to output the exact Viter...

To CNF or not to CNF? An Efficient Yet Presentable Version of the CYK Algorithm

The most familiar algorithm to decide the membership problem for context-free grammars is the one by Cocke, Younger and Kasami (CYK) using grammars in Chomsky normal form (CNF). We propose to teach a simple modification of the CYK algorithm that uses grammars in a much less restrictive binary normal form (2NF) and two precomputations: the set of nullable nonterminals and the inverse of the unit...

Principled Parsing for Indentation-Sensitive Languages

Many languages, such as Haskell, Python, and F#, use the indentation and layout of code as part of their syntax. Because context-free grammars are not able to express these layout rules, existing parsers use ad hoc techniques to handle them. These techniques tend to be low-level and operational in nature, and thus forgo the advantages of more declarative specifications like context-free grammar...

Efficient Implementation of the CKY Algorithm

When the CKY algorithm is presented in Natural Language Processing literature, it is often described in high-level pseudo code. The implementation details of the CKY algorithm, despite being critical to efficiency, are rarely (if ever) discussed. In this paper I discuss multiple implementation approaches, and optimizations on these approaches to increase parsing time an order of magnitude wh...

Beam-Width Prediction for Efficient Context-Free Parsing

Efficient decoding for syntactic parsing has become a necessary research area as statistical grammars grow in accuracy and size and as more NLP applications leverage syntactic analyses. We review prior methods for pruning and then present a new framework that unifies their strengths into a single approach. Using a log linear model, we learn the optimal beam-search pruning parameters for each CY...

Publication date: 2011